Mapping protein information to disease terminologies

Authors

  • Anaïs Mottaz
  • Yum Lina Yip
  • Patrick Ruch
  • Anne-Lise Veuthey
Abstract

To make genomic and proteomic information more accessible to medical researchers, we have developed a procedure that links biological information on proteins involved in diseases to the MeSH and ICD-10 disease terminologies. For this purpose, we took advantage of the manually curated disease annotations in more than 2,000 human protein entries of the UniProt KnowledgeBase. We mapped disease names, extracted from the entry comment lines or from the corresponding OMIM entry, to MeSH. The method was assessed on a benchmark set of 200 manually mapped disease comment lines, on which we obtained a recall of 54% at a precision of 91%. The same procedure was used to map the more than 3,000 diseases in Swiss-Prot to MeSH with comparable efficiency. When tested on ICD-10, the coverage of the mapped terms was lower, which could be explained by the coarse-grained structure of this terminology for describing hereditary diseases. The mapping is provided as supplementary material at http://research.isbsib.ch/unimed.
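As a rough illustration of how a mapping like this can be scored against a manually curated benchmark, the following Python sketch maps disease names to terminology identifiers by normalized exact matching and then computes precision and recall. It is not the authors' procedure: the matching strategy, the function names, the toy terminology, and the identifiers `T001` to `T003` are all assumptions made for this example.

```python
import re


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and trim whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def build_index(terms: dict) -> dict:
    """Map every normalized label or synonym to its term identifier."""
    index = {}
    for term_id, labels in terms.items():
        for label in labels:
            index[normalize(label)] = term_id
    return index


def map_disease(name: str, index: dict):
    """Return the identifier of an exactly matching term, or None."""
    return index.get(normalize(name))


def precision_recall(predicted: dict, gold: dict):
    """Score proposed mappings against a manually mapped benchmark.

    precision = correct mappings / mappings proposed
    recall    = correct mappings / benchmark entries
    """
    proposed = {n: t for n, t in predicted.items() if t is not None}
    correct = sum(1 for n, t in proposed.items() if gold.get(n) == t)
    precision = correct / len(proposed) if proposed else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical terminology: identifier -> preferred label and synonyms.
terms = {
    "T001": ["Cystic Fibrosis", "Mucoviscidosis"],
    "T002": ["Marfan Syndrome"],
    "T003": ["Hemophilia A"],
}

# Hypothetical benchmark: disease name as annotated -> expected identifier.
gold = {
    "cystic fibrosis": "T001",
    "Marfan syndrome": "T002",
    "haemophilia A": "T003",  # spelling variant, missed by exact matching
    "Mucoviscidosis": "T001",
}

index = build_index(terms)
predicted = {name: map_disease(name, index) for name in gold}
precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy setting, exact matching proposes only mappings it is sure of, so a spelling variant lowers recall without affecting precision. That is the same kind of trade-off the reported figures reflect, where precision is high and recall is moderate.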


Similar resources

I-49: Human Y Chromosome Proteome Project

The success of the Human Genome Project (HGP) has provided a blueprint for the approximately 20,000 gene-encoded proteins potentially active in all of the hundreds of cell types that make up the human body. Yet we still have limited knowledge about a majority of the gene-encoded proteins which are the “building blocks of life” and “cellular machinery”. It is estimated that for nearly half of th...


Leveraging Terminological Resources for Mapping between Rare Disease Information Sources

BACKGROUND Rare disease information sources are incompletely and inconsistently cross-referenced to one another, making it difficult for information seekers to navigate across them. The development of such cross-references established manually by experts is generally labor intensive and costly. OBJECTIVES To develop an automatic mapping between two of the major rare diseases information sourc...


Mapping biomedical terminologies using natural language processing tools and UMLS: mapping the Orphanet thesaurus to the MeSH

Background: Orphanet aims to provide rare disease information to healthcare professionals, patients, and their relatives. Objective: The objective of this work is to evaluate two methodologies (UMLS and manual Orphanet-ICD-10 link-based mapping & String Based matching) used to map Orphanet thesaurus to the MeSH thesaurus. Results: On a corpus of 375 mappings, the string based matching provides ...


Mapping de terminologies diagnostiques en cancérologie par l'intermédiaire du NCI Metathesaurus

Many terminologies are used in France for coding cancer diagnoses (ICD-10 for Diagnosis Related Groups, ICDO3 in cancer registries, ADICAP for pathological anatomy). This heterogeneity largely hinders the use of registries’ data. It is thus necessary to integrate the diagnostic terminologies in oncology into a unified and structured system. The NCI Thesaurus is an international terminology, whi...


Standard Anatomic Terminologies: Comparison for Use in a Health Information Exchange–Based Prior Computed Tomography (CT) Alerting System

BACKGROUND A health information exchange (HIE)-based prior computed tomography (CT) alerting system may reduce avoidable CT imaging by notifying ordering clinicians of prior relevant studies when a study is ordered. For maximal effectiveness, a system would alert not only for prior same CTs (exams mapped to the same code from an exam name terminology) but also for similar CTs (exams mapped to d...



Journal:
  • J. Integrative Bioinformatics

Volume 4


Publication date: 2007